Genetic characterization and quantitative trait relationship using multivariate techniques reveal diversity among tomato germplasms

Abstract Tomato accessions collected from different sources were evaluated in a controlled environment in Jimma, Ethiopia, to study their diversity and genotype–trait associations and to pinpoint the most discriminating trait(s). Pot experiments were carried out over two seasons in a randomized complete block design with three replications. The genotype–trait (GT) biplot revealed high variability, above 70% for the first and second principal components (PC) combined, in related growth traits in the two trials, whereas the related floral and fruit trait associations showed medium to high (55%–65%) total explained variation in both seasons. It further showed that ‘wild parent’, ‘CLN2498D’, ‘CLN2498F’, ‘UC Dan India’, ‘Ruma’, ‘PT4722A’, ‘CLN2679F’, ‘CLN2585C’ and ‘CLN2585D’ were the best performers in most of the related growth, floral, and fruit traits in those seasons. Principal component analysis showed that traits such as plant height, number of branches, leaves, nodes, internodes, stem girth, style length, stigma length and diameter, flower length and width, number of flowers per truss, number of fruits per truss, and fruit weight per plant, in the first dimension, were positively related to yield, had consistently high loading factors in both seasons, and could be considered highly important in breeding for increased fruit yield. Clustering and the comparison of cluster means showed that ‘CLN2498D’, ‘PT4722A’, ‘Ruma’, ‘Tropimech’, and ‘UC Dan India’ of cluster I expressed the best related growth, floral, and fruit traits in both trials. Therefore, selection for any trait would favor accessions in this cluster.


| INTRODUCTION
Tomato (Solanum lycopersicum L.) is one of the most important crops of the agro-horticultural sector, not just in Ethiopia but around the world (El-Mansy et al., 2021; Tolessa et al., 2013; Worku & Sahe, 2018), grown for its edible fruits. This nutritious fruit is a valuable vegetable that can be eaten raw as salad, processed as puree, ketchup, juice, or powder, and cooked as tomato sauce or soup, which has been reported to be an effective remedy for sufferers of constipation (Alda et al., 2009; Rehman et al., 2019). Tomato supplies phytochemicals such as β-carotene, flavonoids, lycopene, vitamins, and vital minerals, which contribute greatly to preventing deficiency diseases in humans. It also generates cash for smallholders and medium-scale commercial farmers, being a relatively short-duration and high-yielding crop (Martí et al., 2018).
The increase in demand for fresh tomato fruits and their products has necessitated the quest for high-yielding varieties among tomato farmers. However, the high-yielding cultivars are insufficient to meet the fast-growing global demand for fresh fruit (Asfaw, 2021).
The biotic and abiotic factors such as disease incidence and high humidity associated with most tomato production environments are believed to contribute greatly to the low production (Atugwu et al., 2019). Also, the few available cultivars are out of farmers' reach possibly because of high cost of high premium varietal seeds especially hybrids, alongside other factors. These could possibly be responsible for the low annual fruit yield in Ethiopia and most surrounding East African countries (Tolessa et al., 2013). Hence, there is a need for an extensive germplasm recollection including the wild relatives, characterization, selection, and exploitation of selected but unknown genetic materials. This would help to improve tomato resistance to diseases, adaptability to humid environments, and enhance fruit yield and quality for present and future gain. Evaluation would be helpful in understanding the breeding values and genetic background of the available materials. This would definitely increase tomato productivity in Ethiopia and environs. Ng (1991) earlier opined that genetic resources can only be useful to plant breeders and other plant users when they have been known through adequate characterization and evaluation. By this, breeders can investigate diversity of the species involved to consider perhaps direct introduction as cultivars, or provide useful variation in a breeding program.
Germplasm evaluation is always confronted by two major challenges. The first is genotype-by-environment interaction (GEI) for a particular trait, and the second is unfavorable correlation among influencing traits as well as trait relations of genotypes (Yan & Frégeau-Reid, 2018). El-Aziz et al. (2016) reported that selection of tomato genotypes with superior performance in different spatial environments has been studied, but consistency in performance on certain quantitative traits under temporal environment has not received extensive outlook.
Identification of the major traits that contribute most to higher yield in any crop species during germplasm evaluation is a fundamental objective of any breeding program. Several morphological traits affect tomato fruit yield either positively or negatively. Through a careful examination of the contribution of these component traits, it is easier to concentrate efforts on the traits with higher influence on the primary trait in the future selection process. Through the genotype-trait (GT) biplot, which applies the genotype and genotype-by-environment interaction (GGE) biplot technique as proposed by Yan and Rajcan (2002), trait associations and genotype-trait(s)-specific relations have been visualized graphically. The GT biplot has helped breeders to investigate data on various traits at once, and this can substantially improve indirect selection of parental lines, unlike most univariate tools, which explore traits in a data set separately.
The application of the GGE biplot to study the GT correlation matrix has been reported in some crop species such as soybean (Yan & Rajcan, 2002), buckwheat (Joshi, 2012; Joshi & Okuno, 2010), linseed (Soto-Cerda et al., 2014), oats (Martin et al., 2014), tartary sunflower (De C. Leite & de Oliveira, 2015), forage sorghum (Aruna et al., 2016), Ethiopian white lupin (Atnaf et al., 2017), durum wheat (Kendal, 2019; Mohammadi, 2019), sesame (Boureima & Yaou, 2019), and maize (Munawar et al., 2013; Shojaei et al., 2020). Although many reports have applied GT analysis to discover superior accessions in many crop species, there is a paucity of information on the relationship between genotypes and the quantitative related growth, floral, and fruit traits considered simultaneously in tomato germplasm, especially in a controlled environment. Recently, the GT biplot technique was used to assess the adaptability of advanced generations of wild and cultivated tomato crosses under open field conditions (Atugwu et al., 2019), and to characterize greenhouse tomato germplasm for NaCl tolerance at the seedling stage only (Rehman et al., 2019).
Cluster analysis has often been used recently as a genetic tool to shed light on the degree of relatedness of genetic materials based on the traits under consideration. It separates the accessions into dissimilar groups based on Euclidean distance (Subramanian & Subbaraman, 2010) for easy selection. Principal component analysis (PCA) shows how much each trait contributes, whether much, little, or nothing, to the variation observed among accessions. It suggests which trait expresses higher variability based on its magnitude and qualifies such trait(s) as the most discriminating among accessions (Ene et al., 2016b). In the GT biplot, the first two principal components (Dimension 1, primary effects; and Dimension 2, secondary effects) from the data are plotted. If they cannot explain a sufficient percentage of the variance in the data, other dimensions may be examined using a scree plot or related output (Aruna et al., 2016).
In the present investigation, GT biplot was used to select the tomato accessions using multiple-trait data. The cultivated tomato germplasm including a wild parent of Solanum pimpinellifolium L. species collected from different agro-ecological sources was screened together for two seasons under a controlled environment. The main objectives of the study were (1) to evaluate tomato accessions using cluster analysis and PCA as genetic tools to check relatedness among accessions and most discriminating trait(s), respectively; (2) to understand trait associations in tomato germplasm using GT biplot analysis; and (3) to identify high- and low-performing accessions in the studied traits that could warrant selection for the development of interspecific breeding/mapping populations, which could possibly go in for a QTL linkage mapping.

The mean minimum and maximum temperature are about 11.4°C and 28°C, respectively, while the average annual rainfall is a little above 1,500 mm which occurs from April to October. The relative humidity is 37.92% and 94.4% as minimum and maximum, respectively (BPEDORS, 2000).

| Plant material, site description, and layout
Seeds were sown in plastic trays filled with a mixture of carbonated rice husks and a small amount of topsoil to raise the seedlings of all the accessions in the greenhouse. Routine nursery practices were observed, which aided the production of vigorous seedlings that were transplanted on the 26th day after germination into 28 cm experimental pots arranged in fixed positions and laid out in a randomized complete block design with three replicates under greenhouse conditions. The pots were filled with a well-mixed, organically enriched compost together with topsoil based on the required standard, as recommended by Agong et al. (1997).

| Agronomic practices
All the standard horticultural practices required for greenhouse tomato production, such as irrigation, weed picking, and application of fungicide (Ridomil-Mancozeb and Metalaxyl-M), insecticide (Karate-Lambda-Cyhalothrin 5% EC), and fertilizer (DAP, diammonium phosphate), were applied as recommended on the product labels to curtail fungal and insect attack and maintain healthy growth. Observations were carried out and records were taken on five randomly selected plants per accession for each replicate.

| Recorded observations
Data were recorded on the following related growth, floral, and fruit traits.

| Related growth traits
These traits were recorded at 9 weeks after transplanting. Plant height (cm) was measured with a meter tape from the plant base to the shoot tip, leaf length (cm) from the point of attachment to the petiole to the leaf tip, and leaf width (cm) at the widest point of the leaf. Leaf area (cm²) was calculated using the formula X = 0.5 × L × W, according to Carmassi et al. (2007a), where X = leaf area, 0.5 = constant, L = leaf length, and W = leaf width. Number of leaves, number of branches, number of nodes, and number of internodes were counted, and stem girth (cm) was measured with a vernier caliper.
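The leaf area formula above can be sketched in a few lines of Python. The function name and the example measurements are hypothetical and are not data from the study; only the formula X = 0.5 × L × W comes from the text.

```python
def leaf_area(length_cm: float, width_cm: float, constant: float = 0.5) -> float:
    """Leaf area (cm^2) estimated as X = constant * L * W, with constant = 0.5."""
    if length_cm < 0 or width_cm < 0:
        raise ValueError("leaf dimensions must be non-negative")
    return constant * length_cm * width_cm


# Hypothetical leaf measurements in cm (illustration only).
example_area = leaf_area(12.0, 8.0)
print(f"Estimated leaf area: {example_area} cm^2")
```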

| Related floral traits
Related floral traits include number of days to first anthesis, number of days to 50% anthesis, number of flowers per truss, total number of flowers per plant, number of aborted flowers per plant, flower length (cm), and flower width (cm). Others included style length, style diameter, stigma length, and stigma diameter, all in centimeters, measured using a moticam with Motic Image Plus 2.0 software. Ovary length and ovary diameter were taken using an ocular micrometer, also in centimeters. This was done after harvesting a given flower from sampled plants, taking it to the laboratory, and cutting it longitudinally to expose the ovary and the other floral parts mentioned. The ovary area (cm²) and ovary perimeter (cm) were calculated uniformly using the formula for area (πr²) and circumference (2πr) of a circle, respectively, the ovary being circular in shape according to Nnungu and Uguru (2014
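As a minimal sketch of the circle formulas named above, the snippet below derives ovary area and perimeter from a measured diameter. It assumes the radius is taken as half the ovary diameter, which the text does not state explicitly; the function name and the example value are hypothetical.

```python
import math


def ovary_area_and_perimeter(diameter_cm: float) -> tuple[float, float]:
    """Return (area in cm^2, perimeter in cm), treating the ovary as a circle."""
    radius = diameter_cm / 2.0  # assumption: r is half the measured diameter
    area = math.pi * radius ** 2        # area = pi * r^2
    perimeter = 2.0 * math.pi * radius  # circumference = 2 * pi * r
    return area, perimeter


# Hypothetical ovary diameter in cm (illustration only).
area, perimeter = ovary_area_and_perimeter(0.2)
print(f"area = {area:.4f} cm^2, perimeter = {perimeter:.4f} cm")
```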

| Data statistical analysis
All statistical analysis was done separately for each evaluation season.

| Genotype-by-trait biplot analysis
The collected data on the abovementioned quantitative traits were subjected to ANOVA as described by Steel et al. (1997) to test for significant genotypic differences. Data were further analyzed by the multivariate technique 'genotype-by-trait biplots', an option of GGE biplot software version 6.3 (Yan, 2001), on separate data from each season using 'Scaling 1'. For phenotypic correlations among traits according to Yan and Tinker (2005), trait-focused singular value partitioning (SVP = 2) was employed while a tester-centered (centering 2) GGE biplot was generated. Here, traits were regarded as 'testers' when using the 'relation among testers' option. The 'which-won-where' polygon option was used to identify which accession was the best in a given set of traits and, hence, to identify the superior accession(s).
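The biplot coordinates described above can be approximated with a singular value decomposition of the trait-standardized genotype-by-trait table. The following is a sketch under stated assumptions, not the GGE biplot software's implementation: it assumes that 'centering 2' with 'Scaling 1' amounts to subtracting each trait mean and dividing by the trait standard deviation, and that SVP = 2 assigns the singular values entirely to the trait (tester) scores. The function name and the data are hypothetical.

```python
import numpy as np


def gt_biplot_scores(table, n_axes=2):
    """Return genotype scores, trait scores, and explained variance per axis."""
    table = np.asarray(table, dtype=float)
    # Trait-centered and SD-scaled table (rows: genotypes, columns: traits).
    standardized = (table - table.mean(axis=0)) / table.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(standardized, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    # Trait-focused partitioning: singular values go to the trait vectors.
    genotype_scores = u[:, :n_axes]
    trait_scores = vt[:n_axes].T * s[:n_axes]
    return genotype_scores, trait_scores, explained[:n_axes]


# Hypothetical accession means for three traits (illustration only).
means = np.array([
    [95.0, 14.0, 1.1],
    [62.0, 9.0, 1.4],
    [70.0, 10.0, 1.6],
    [88.0, 13.0, 1.0],
    [55.0, 8.0, 1.3],
])
geno, traits, explained = gt_biplot_scores(means)
print("Variation explained by PC1 + PC2:", round(float(explained.sum()) * 100, 1), "%")
```

With this partitioning, the cosine of the angle between two trait vectors approximates the correlation between those traits, which is the property the 'relation among testers' view relies on.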

| Trait associations through genotype-by-trait biplot
The which-won-where view of the GT biplot was used for the study under traits association through genotype-by-trait biplot analysis. The 'which is best for what' view of the GT biplot showing the visualization of the relationships among the related growth traits across tomato accessions during the first and second evaluation trials is presented in Figures 1 and 2, respectively. This analysis was to help make a comparison among tomato accessions on the basis of multiple related growth traits numbering 7 and to identify superior accession(s) for the given trait(s). The total variations explained by the PC1 and PC2 for the related growth traits in the first evaluation biplot were 84.1% while the second planting showed 86.8%.
The first evaluation trial (Figure 1) showed that the 'wild parent' had high performance for number of leaves, number of branches, plant height, number of nodes, and number of internodes. 'CLN2498D' and 'CLN2498F' were better for leaf area and stem girth followed

The GT biplot analysis indicated high percentage variability, above 70%, in related growth traits correlation for PC1 and PC2 combined in the two season trials, whereas the GT biplots for related floral and fruit trait associations showed medium-to-high (55%-65%) total explained variation in both seasons. This suggests high and medium-to-high variability in the performance of the tomato accessions for the related growth and related floral-fruit traits, respectively. According to Yan and Rajcan (2002), the GT biplot analysis is a standard statistical tool that helps breeders to visualize the relationships that exist among traits, and characterize accessions based on variability that exists across multiple traits. This is in order to identify those accessions that were superperforming in particular trait(s

| Principal component analysis
In the present study, the first five principal components explained 78.08% and 77.70% of the cumulative variance for the first and second seasons, respectively. It was observed that the first two components appeared in smaller magnitude of total variation with eigenvalues higher than 2.0. In another report, Rai et al. (2017) stated that 95% of the total variation found among 56 tomato gen-

Akinwale et al. (2014) noted with concern in their review article that no study has been able to determine when the total percentage of variation accounted for by a biplot should be considered too small or insignificant to give a reasonable judgment. According to them, nevertheless, general assumptions have projected a cumulative proportion of variation of less than 40%, with higher eigenvalues probably greater than 1.0 or 2.0, as being too small to make inference.
Going by this argument, in the present study the first component accounted for less than 40% of the total variation, with higher eigenvalues, in both seasons and was therefore not enough to draw conclusions; hence the need to consider other dimensions. Pradhan et al. (2011) reported that eigenvalues are derivatives of principal components, which are used to specify the relative discriminative power of the axes and their associated traits.
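The eigenvalue and cumulative-variance reasoning above can be illustrated with a short sketch. It assumes PCA on the trait correlation matrix, which the text does not specify; the function name is hypothetical, the data are randomly generated placeholders rather than the study's measurements, and the 70% threshold is used only as an example.

```python
import numpy as np


def pca_summary(data):
    """Eigenvalues, proportion, and cumulative proportion of variance per component."""
    data = np.asarray(data, dtype=float)
    corr = np.corrcoef(data, rowvar=False)        # trait-by-trait correlation matrix
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # sorted from largest to smallest
    proportion = eigenvalues / eigenvalues.sum()
    cumulative = np.cumsum(proportion)
    return eigenvalues, proportion, cumulative


# Placeholder data: 35 accessions by 6 traits, drawn at random (illustration only).
rng = np.random.default_rng(0)
data = rng.normal(size=(35, 6))
eigenvalues, proportion, cumulative = pca_summary(data)

# Number of components needed to reach an example threshold of 70% of total variance.
n_needed = int(np.searchsorted(cumulative, 0.70) + 1)
print("Components with eigenvalue > 1.0:", int((eigenvalues > 1.0).sum()))
print("Components needed for 70% cumulative variance:", n_needed)
```

When the first component alone falls below a chosen threshold, the cumulative column shows how many further dimensions need to be considered.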

| Hierarchical cluster analysis
Cluster analysis is used for the identification of different clusters based on the grouping patterns of the accessions evaluated (Nankar et al., 2020). It has demonstrated effective classification of genetic materials which of course is significantly helpful in conserving their biodiversity, and, hence, utilization in crop improvement program (Shukla et al., 2010).
According to the dendrogram, cluster analysis grouped 35 tomato accessions in both the first and second evaluations into three clusters as shown in Figures 11 and 12 as having performed higher than the population means. The same result was applicable with regards to traits that expressed tomato shelf-life performance, such as number of days to first, 50%, and 100% fruit spoilage, as clusters I and II took leadership in those traits as well. In the second season of tomato evaluation (Table 3), the cluster means followed the similar pattern as in the first trial except that cluster III had values which were higher than the population means. However, cluster III was still lower than those of cluster I for plant height and fruit pericarp thickness. Cluster II showed a decline, slightly lower than the population mean for fruit diameter unlike the result obtained in the first outing, although it was also with slight increase higher than the population mean.

cucumber, and Nankar et al. (2020) in tomato have reported the same result. The superior related vegetative, floral, and fruit traits performance of the accessions aligned in cluster I over clusters II and III in both seasons indicates gainful exploitation in tomato improvement programs. From the present investigation, it could be stated that cluster analysis obviously can be considered as an effective tool to assort tomato accessions based on their performance relatedness for the traits studied. Feng-Mei et al. (2006) and Iqbal et al. (2014) reported cluster analysis as having provided authentic foundation for selection of base materials to outline future improvement plans in tomato. Nevertheless, the authors in addition mentioned that while the selection of base material is being made, genetic barriers must be handled, as well as choosing appropriate breeding methods to obtain anticipated genetic improvements for traits desired.
Cluster analysis has been utilized extensively in tomato germplasm improvement studies on different quantitative and qualitative traits in various parts of the world (Iqbal et al., 2014; Kiran et al., 2017; Mishra et al., 2018; Nankar et al., 2020; Prakash & Vijay, 2017; Rehman et al., 2019).
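The grouping of accessions and the comparison of cluster means with population means can be sketched as follows. This is an illustration under stated assumptions: it uses Euclidean distance with average linkage, whereas the linkage method used in the study is not given in this section, and all names and values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two trait vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage_clusters(rows, n_clusters):
    """Agglomerative clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                total = sum(euclidean(rows[i], rows[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def column_means(rows):
    """Mean of each trait (column) across the given rows."""
    return [sum(col) / len(col) for col in zip(*rows)]


# Hypothetical accession means: (plant height in cm, fruits per truss).
data = [
    (95.0, 6.0), (92.0, 5.5), (90.0, 6.2),
    (60.0, 3.0), (62.0, 3.4),
    (75.0, 4.5), (78.0, 4.2),
]
clusters = average_linkage_clusters(data, 3)
population_mean = column_means(data)
cluster_means = [column_means([data[i] for i in members]) for members in clusters]
for members, means in zip(clusters, cluster_means):
    print(sorted(members), [round(m, 2) for m in means])
```

In practice, traits measured on different scales would usually be standardized before computing distances so that no single trait dominates the grouping.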

| CONCLUSIONS
Multivariate analysis is an efficient technique to quantify diversity among germplasm due to trait variability. Generally, multivariate analysis gave perception of tomato accessions separation into dif-

ACKNOWLEDGEMENTS
We are grateful to the Ethiopian Agricultural Research Institute-Melkassa Agricultural Research Center, Ethiopia; National Horticultural Research Institute, Ibadan, Nigeria, and University of Nigeria, Nsukka, Nigeria, for providing us with the planting materials including the wild relative accession for this study.

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest.